LOT: A Story-Centric Benchmark for Evaluating Chinese Long Text Understanding and Generation
Abstract
Standard multi-task benchmarks are essential for developing pretraining models that can generalize to various downstream tasks. Existing benchmarks for natural language processing (NLP) usually focus only on understanding or generating short texts. However, long text modeling requires many distinct abilities in contrast to short texts, such as the modeling of long-range discourse and commonsense relations, and the coherence and controllability of generation. The lack of standardized benchmarks makes it difficult to assess these abilities of a model and fairly compare different models, especially Chinese models. Therefore, we propose a story-centric benchmark named LOT for evaluating Chinese long text modeling, which aggregates two understanding tasks and two generation tasks. We construct new datasets for these tasks based on human-written Chinese stories with hundreds of words. Furthermore, we release an encoder-decoder-based Chinese long text pretraining model named LongLM with up to 1 billion parameters. We pretrain LongLM on 120G Chinese novels with two generative tasks including text infilling and conditional continuation. Extensive experiments show that LongLM outperforms similar-sized pretraining models substantially on both the understanding and generation tasks in LOT.
Related resources
Extending and Evaluating a Platform for Story Understanding
We summarize recent developments in our platform for symbolically representing and reasoning over human narratives. The expressive range of the system is bolstered by the infusion of a large library of knowledge frames, including verbs, adjectives, nouns and adverbs, from external linguistic resources. Extensions to the model itself include alternate timelines (imagined states for goals, plans,...
Evaluating Text Understanding Systems
The Naval Ocean Systems Center is extending the scope of previous efforts in the area of evaluating English text analysis systems and is seeking to refine the methodology in order to obtain performance benchmarks on an information extraction task for recall, precision, overgeneration, and fallout for a variety of systems. The methodology is also intended to enable the collection of qualita...
Evaluating Generative Models for Text Generation
Generating human quality text is a challenging problem because of ambiguity of meaning and difficulty in modeling long term semantic connections. Recurrent Neural Networks (RNNs) have shown promising results in this problem domain, with the most common approach to its training being to maximize the log predictive likelihood of each true token in the training sequence given the previously observ...
Meaning-Centric Framework for Natural Text/Scene Understanding by Robots
In the past fifty years, efforts in classical AI have focussed on computerizing human intelligence. Naturally, computerized human intelligence is not a proof of machine or robot intelligence because the programs underlying computerized human intelligence are still made by humans. So far, there is no computer nor robot which is creative enough to master its own language and to compose a text exp...
Journal
Journal title: Transactions of the Association for Computational Linguistics
Year: 2022
ISSN: 2307-387X
DOI: https://doi.org/10.1162/tacl_a_00469